SUMMARY


The advantages and limitations of using computer-animated stimuli in studying 
motion perception are presented and discussed. Most current programs of motion- 
perception research could not be pursued without the use of computer graphics anima- 
tion. Computer-generated displays afford latitudes of freedom and control that are 
almost impossible to attain through conventional methods. There are, however, 
limitations to this presentational medium. At present, computer-generated displays 
present simplified approximations of the dynamics in natural events. We know very 
little about how the differences between natural events and computer simulations 
influence perceptual processing. In practice, we tend to assume that the differ- 
ences are irrelevant to the questions under study, and that findings with computer- 
generated stimuli will generalize to natural events. 


INTRODUCTION 


This paper is divided into two parts. In the first, we discuss some of the 
many advantages of employing computer-graphics animation in motion-perception 
research. In the second, we discuss some of the limitations inherent to this pres- 
entational medium. We suggest that, although many research programs could not 
possibly be pursued today without computer animation, existing graphics displays 
never model perfectly the environmental dynamics that they are intended to simu- 
late. Little is known about how these differences may influence perceptual 
processing. 


This research was supported by grants from NASA, NCA2-87; NICHD, HD-16195; and 
the Virginia Center for Innovative Technology, INF-85-014. Stephen Ellis, Scott 
Fisher, David Gilden, Jeffrey Lande, and Susan Whelan provided valuable criticism on 
an earlier version of this paper. 


ADVANTAGES OF COMPUTER GENERATED STIMULI


Much of the progress in motion-perception research was made possible by the 
development of computer-graphics animation technologies. Johansson (1950) set the
course for subsequent research with studies employing dynamic point-light stimuli 
displayed on an oscilloscope. Johansson's seminal work not only influenced our 
conceptual understanding of the role that motion plays in organizing the visual 
world, but also established the directions of future research methodology. 

Much of the work in motion perception has focused on identifying minimal condi- 
tions for perceiving various environmental properties. Consider, for example, 
investigations of the perception of form (3-D structure) from motion information. 
Probably the most dramatic demonstrations that motion is a minimal condition for 
perceiving form are the kinetic depth effect demonstrations (Wallach and O'Connell, 
1953/1976) and point-light walker displays (Johansson, 1973). In the original 
kinetic depth effect demonstrations, shadows of unfamiliar wire forms were projected 
onto a screen. Without motion, these shadows appeared as 2-D configurations of 
lines; however, when the wire forms were rotated, their 3-D structure was
immediately evident. Johansson's point-light walker studies were made by attaching small
lights to the joints of people and filming them as they walked in the dark. As with 
the kinetic depth effect, static frames from these displays appeared as meaningless 
2-D arrays of dots; however, a very brief viewing of a moving sequence allowed the 
observer to identify the projection as a locomoting person. 

More recently, researchers have increasingly turned to the use of
computer-generated dynamic displays. Braunstein (1976) developed a computer-based
methodology for creating complex kinetic depth effect displays consisting of point-lights.
Cutting (1978) created a general program for generating point-light walkers. 
Bertenthal, Proffitt, and Keller (1985) wrote a very general animation program for 
the Apple microcomputer, interfaced to a Texas Instruments TMS 9918A video display 
processor, that allows one to create point-light projections (limit is 32 points) of 
rigid or jointed objects capable of all 6 degrees-of-freedom movement. (All of the
demonstrations in Johansson's (1950) book can be easily re-created with this
program.)

There is now a long list of research topics in which dynamic computer-generated 
displays are used. This list includes studies on perceiving ego (self) motion, 
texture segregation, form, form change, object displacement, and dynamics (the 
recovery of mass and force information from kinematics). It has also been found 
that infants as young as 3 months of age will attend with interest to computer- 
generated point-light displays and can extract some structure from them (Bertenthal, 
Proffitt, and Cutting, 1984). A recent issue of Perception (1985), devoted to 
motion perception, contained a wide variety of research reports employing computer 
animation. This issue even provided an Apple disk which allowed the reader to view 
many of the dynamic stimuli discussed. 

Computer animation has numerous advantages over other techniques for creating 
minimal motion displays. Programmed displays are far more flexible and easier to 
create than are physical mechanisms which are constructed to produce the desired
motions. Computer displays can be easily programmed to display violations in nat- 
ural dynamics (events in which the laws of physics are violated). This capability 
proves to be very useful in assessing visual sensitivities to natural dynamics and 
is, of course, extremely difficult to achieve using real objects. Finally, 
computer-generated stimuli provide the researcher with exact knowledge of the
display's parameters. 

An example of the importance of such tractability can be seen in research on 
perceiving point-light walkers. Proffitt, Bertenthal, and Roberts (1984) created a 
computer display containing all of the information previously thought to be effec- 
tive in the perception of Johansson's (1973) naturally produced point-light 
walker. After 1-1/2 min of viewing, only about a third of their subjects recognized 
that this display could be seen as a projection of a person walking. We still do 
not know all of the parameters of information that people use when extracting the 
human form from Johansson's original, naturally produced, point-light walker dis- 
plays. Computer simulations can, in cases such as this, serve as empirical tests of 
processing models. 


LIMITATIONS OF COMPUTER GENERATED STIMULI 


Dual Awareness 

As with static pictures, viewing computer-animated displays gives the observer 
a dual awareness: (1) a transforming 2-D pattern appearing on the terminal screen,
and (2) the 3-D event that is being simulated. Gibson (1979) argued that this dual
awareness was one of the aspects of picture perception that made it difficult to 
generalize from research employing pictures to the perception of real objects and 
events. One of the important properties that is absent in pictures is, of course, 
motion; however, the substitution of motion for pictorial cues does not necessarily 
make dynamic computer displays more ecologically valid. 

Whenever people look at computer-animated displays, they are presented with 
conflicting information about depth relationships. All of the primary depth cues 
specify that the transforming projections are 2-D. Moreover, unless the displays' 
motions are yoked to the head movements of the observer (Rogers and Graham, 1979), 
the absence of motion parallax will further define 2-D aspects of the display. At 
odds with this information are the displays' dynamics specifying 3-D structures. 

What is the influence of this dual awareness on the perceptual processing of 
computer-animated displays? Our own research suggests that the ability to extract 
dynamic information from motion displays is related to the degree of naturalness 
found in the simulation (Kaiser, Proffitt, and Anderson, 1985). We suspect that too 
great a reliance on dynamic computer displays may result in an underestimation of 
people's sensitivities to motion-specified information. 

Scaling Perspective, Depth, and Size


A perceived object in a computer-animated display does not appear to be located 
on the monitor's screen; rather it appears to be somewhere behind the screen. This 
indeterminacy of absolute depth (observer-relative depth) creates a set of difficult 
problems in programming a naturally appearing simulation. 

To generate a perspective projection of a rigid object undergoing rotation and 
to compute your perspective transformation, one of the parameters that must be 
specified is the distance between the geometric eyepoint and the simulated object. 
(Hagen (1980) is an excellent source of articles on the geometry of perspective 
projection.) Should you assume this distance to be equal to the distance between 
the eye and the screen, you will be surprised as the simulated object will appear to 
deform drastically as it rotates. 
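
The role of the eyepoint-to-object distance can be illustrated with a minimal sketch. The function names, the coordinate convention (eyepoint on the z axis, projection plane through the object's center), and the example distances are assumptions made for illustration; they are not taken from any program cited here.

```python
import math

def rotate_y(point, angle_rad):
    """Rotate a 3-D point (x, y, z) about the vertical (y) axis."""
    x, y, z = point
    c, s = math.cos(angle_rad), math.sin(angle_rad)
    return (c * x + s * z, y, -s * x + c * z)

def perspective_project(point, eye_to_object):
    """Project a point onto a plane through the object's center.

    Assumed convention: the eyepoint sits at z = -eye_to_object and looks
    along +z, so points with negative z are nearer the eye and are magnified.
    """
    x, y, z = point
    scale = eye_to_object / (eye_to_object + z)
    return (x * scale, y * scale)

def near_far_ratio(half_depth, eye_to_object):
    """Projected size of an edge at the nearest depth divided by the
    projected size of an equal edge at the farthest depth."""
    near = eye_to_object / (eye_to_object - half_depth)
    far = eye_to_object / (eye_to_object + half_depth)
    return near / far
```

Under these assumptions, an object extending 10 cm in front of and behind its center gives a near-to-far size ratio of 90/70 (about 1.29) when the eyepoint distance is set to 80 cm, but only 410/390 (about 1.05) when it is set to 400 cm. The amount of perspective change during rotation is therefore very sensitive to this one parameter.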

A rigid object will appear to deform as it rotates unless either the perceived 
viewing distance is specified accurately or the perspective transformation is not 
salient. Thus determining appropriate viewing distances becomes an empirical rather 
than a purely geometrical problem. Moreover, perceived viewing distance may vary 
across individuals, and may not remain stable over time within an individual. With 
regard to the salience of perspective transformations, many researchers present 
displays in parallel projection and obtain few reports of nonrigid motions.

If perspective information is not given, or if it is not sufficiently salient, 
then depth-order ambiguities arise unless the animation program is capable of hidden 
surface removal. (Depth order refers to whether elements are in front of or behind 
other elements in the display.) Depth-order ambiguities, in turn, can affect the 
motions that are seen. In kinetic depth effect displays, for example, the perceived 
direction of motion will spontaneously reverse unless perspective information or 
occlusion is provided (Braunstein, Anderson, and Riefer, 1982). For point-light 
displays, depth order can be specified by causing some points to disappear and other 
points to remain visible whenever they pass through a particular location. In such 
cases, the points that disappear are seen as being behind those that remain visible, 
and an invisible, intermediate, occluding surface is perceptually specified
(Proffitt et al., 1984). 
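
The geometric basis of the depth-order ambiguity in parallel projection can be checked with a short sketch. It assumes rotation about the vertical axis and a projection that simply discards depth; the function names are hypothetical. A point cloud rotating in one direction and its depth-mirrored twin rotating in the opposite direction produce identical images, which is why the perceived direction of rotation is free to reverse.

```python
import math

def rotate_y(point, angle_rad):
    """Rotate a 3-D point (x, y, z) about the vertical (y) axis."""
    x, y, z = point
    c, s = math.cos(angle_rad), math.sin(angle_rad)
    return (c * x + s * z, y, -s * x + c * z)

def parallel_project(point):
    """Parallel (orthographic) projection: drop the depth coordinate."""
    x, y, _z = point
    return (x, y)

def mirror_depth(point):
    """Reflect a point through the image plane (reverse its depth order)."""
    x, y, z = point
    return (x, y, -z)

def images_match(points, angle_rad, tol=1e-9):
    """True if the cloud rotated by +angle and the depth-mirrored cloud
    rotated by -angle project to the same 2-D positions."""
    for p in points:
        a = parallel_project(rotate_y(p, angle_rad))
        b = parallel_project(rotate_y(mirror_depth(p), -angle_rad))
        if abs(a[0] - b[0]) > tol or abs(a[1] - b[1]) > tol:
            return False
    return True
```

For any set of points and any rotation angle, `images_match` returns True under these assumptions. Perspective or occlusion breaks the equivalence because both depend on the sign of the depth coordinate.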

Since the perceived distance from observer to simulated object cannot be deter- 
mined geometrically, neither can the object's absolute size. If apparent distance 
and size are different than the values intended by the programmer, then the dis- 
play's dynamics may also appear inappropriate or out of scale. Consider, for exam- 
ple, a simulation of a falling object. If the object's perceived size and distance 
are different from the assumed parameters of the program, then the object will 
appear to fall either too fast or too slow. We have successfully specified size and 
distance in such a display by placing a simulated familiar object, such as a person, 
in the same depth plane as the falling object. 
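
The scaling problem for the falling object can be made concrete with a small sketch. It assumes simple projective geometry (the same visual angle is attributed to an object at a different distance) and a drop that starts at eye level; the names and the example distances are illustrative only, and the sketch makes no claim about how observers actually arrive at a perceived distance.

```python
import math

G = 9.81  # gravitational acceleration in m/s^2

def drop_height(t, accel=G):
    """Distance fallen from rest after t seconds."""
    return 0.5 * accel * t * t

def visual_angle_deg(height_m, distance_m):
    """Visual angle subtended by a vertical drop that starts at eye level."""
    return math.degrees(math.atan2(height_m, distance_m))

def implied_acceleration(assumed_distance_m, perceived_distance_m, accel=G):
    """Acceleration that would produce the same image motion if the object
    were located at the perceived rather than the programmed distance."""
    return accel * perceived_distance_m / assumed_distance_m
```

If the program assumes a distance of 10 m but the object is perceived at 5 m, the image motion corresponds to an acceleration of about 4.9 m/s^2, so the object should appear to fall too slowly. A perceived distance greater than the programmed one has the opposite effect.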

Size of Visual Field


Unlike natural scenes, computer displays subtend a limited area of an 
observer's field of view. The better computer graphic systems in today's market 
employ 1024 x 1024 pixel RGB monitors. If an observer views such a monitor from a 
distance of about 80 cm, the display has fairly good resolution (approximately 
40 pixels/deg), but subtends less than 26 deg of visual angle. The observer has the 
clear sensation of viewing a window on a scene rather than the scene itself. 
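
The figures above follow from the standard visual-angle formula, sketched below. The 36 cm display width is an assumption chosen for illustration (no monitor size is stated in the text), and the pixels-per-degree value is an average across the screen.

```python
import math

def visual_angle_deg(extent_cm, distance_cm):
    """Visual angle subtended by a flat extent centered on the line of sight."""
    return math.degrees(2.0 * math.atan(extent_cm / (2.0 * distance_cm)))

def pixels_per_degree(n_pixels, extent_cm, distance_cm):
    """Average number of pixels per degree of visual angle."""
    return n_pixels / visual_angle_deg(extent_cm, distance_cm)

# Assumed example: a 1024-pixel-wide image, 36 cm across, viewed from 80 cm.
angle = visual_angle_deg(36.0, 80.0)
density = pixels_per_degree(1024, 36.0, 80.0)
```

With these assumed values the display subtends roughly 25 deg and provides roughly 40 pixels/deg, in line with the numbers quoted above. Halving the viewing distance enlarges the field but roughly halves the resolution.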

Attempts to enlarge the visual display entail significant compromises in terms 
of cost, resolution, color capability, or update rate. Given present technology, it 
is extremely costly to employ larger CRT displays and it is unlikely that CRT
displays will ever become cost-effective for wide field-of-view displays. We are
currently using two 45 in. rear-projection screens in studies of peripheral-motion 
information processing; however, all currently available rear- or front-projection 
systems have several drawbacks, notably lack of resolution, contrast, and bright- 
ness. In addition, projection systems tend to be cumbersome and difficult to adapt 
to all experimental situations. 

Recently, effort has been directed toward the development of head-mounted 
display systems. The advantages of such systems include: they are capable of
presenting binocular displays; a wide variety of apparent display sizes can be
produced on a single, small screen (although fairly sophisticated optics are needed 
to produce appropriate geometries); their displays can be yoked to the observer's 
head motions or other monitored activity. In fact, employing head-tracking technol- 
ogies, such a head-mounted display can create a 360 deg stereoscopic visual environ- 
ment (Fisher, Space Station Human Factors Research Review, 1985). The disadvantages 
of such a system include the cost and awkwardness of high-resolution color displays 
and the relatively high hardware and software costs. In addition, head-mounted 
displays may prove inappropriate for some subject populations (e.g., infants) or 
experimental tasks. 

At some point, the size of the visual field becomes constrained, not only by 
display technology, but also by limitations in the computational hardware; the 
system simply cannot compute values for all the pixels in the scene, given the 
required update rate. When this happens, several remedies are possible. First, the 
system can convert from real-time displays to storage of display sequences on some 
random-access storage device (e.g., a laser disc). Alternatively, one can create 
variable resolution displays which compute high-resolution displays only for the 
area surrounding the observer's current fixation point. The latter solution is 
fairly elaborate, but is being pursued in contexts requiring high-resolution real- 
time graphic-animation displays (e.g., flight simulators). A third solution which 
reduces the computational demand is to reduce the update rate slightly, yet retain 
the real-time nature of the display. This solution creates a new set of problems 
which are the next topic of consideration: the quality of motion in
computer-generated displays.

Motion Quality


The quality of simulated motions depends primarily upon two things: (1) the
particular motion algorithms employed, and (2) whether or not the system possesses
sufficient computational power to execute the algorithms in time for the next screen 
update. We discuss the issue of motion algorithm adequacy below, and again later, 
under the topic of simulation dynamics. First, we consider the computational-power
issue. 

As mentioned above, the update rate can be increased if scene resolution (or 
complexity) is reduced. Thus, most researchers are left with a direct trade-off 
between update rate and computational complexity. If there is no need for real-time 
animation, the researcher gains a huge advantage. Complex events can be generated 
one frame at a time, with the ensemble of frames later shown at the desired update 
rate. Braunstein (1976) used this method to generate many of his depth-from-motion 
stimuli. Static images from a motion sequence were created on a computer terminal 
and recorded frame-by-frame on 16 mm movie film. Many expensive systems on the 
market today store the image on video tape or laser disc (the latter having the 
advantage of rapid random-access capabilities). However, all these techniques 
require that the researcher have a well-defined, limited number of sequences to be 
computed. Further, all but the laser disc technique make it extremely difficult to
alter the order of sequence presentation as in a response-dependent experimental 
design (e.g., staircase methodologies).

If, then, one wishes to generate stimuli in real-time, the trade-off between 
complexity and update rate remains. The third factor in this trade-off is cost: 
the more expensive computer systems have greater computational capabilities and 
software enhancements. Fortunately, the power-cost trade-off continues to become 
more favorable to the consumer. Today's microcomputers are more powerful graphic 
systems than minicomputers of a decade ago. In particular, the Motorola 68000 chip- 
based processors (and their 68010 and 68020 successors) provide impressive perfor- 
mance. In addition, some microcomputers (e.g., the Commodore Amiga) have dedicated 
graphic processing chips. Unfortunately, performance always seems to lag expecta- 
tions, and researchers are likely to lag state-of-the-art performance owing to 
economic and procurement constraints. 

Researchers disagree on what update rate is acceptable for dynamic stimuli and, 
of course, the rate depends on characteristics of the event. The update rate at 
which time sampled motion becomes indistinguishable from smooth motion depends upon 
the velocity and spatial frequency content of the image, as well as observer factors 
(Watson, Ahumada, and Farrell, 1986). Ultimately, update rates will be limited by 
display hardware constraints. Researchers using raster display systems are limited
by the refresh rate of the monitor. This rate is generally 60 Hz for systems in 
North America, and 50 Hz for systems marketed overseas (although some manufacturers 
use nonstandard rates, e.g., Sun Microsystems have 66 Hz monitors). 
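
The dependence on velocity can be seen from the size of the jump between successive frames. The sketch below only computes that jump; it does not model the visibility thresholds reported by Watson, Ahumada, and Farrell (1986). The default of 40 pixels/deg is borrowed from the display example given earlier, and the function name is hypothetical.

```python
def step_per_frame(velocity_deg_per_s, update_hz, pixels_per_deg=40.0):
    """Displacement of a moving element between successive frames,
    returned as (degrees, pixels)."""
    step_deg = velocity_deg_per_s / update_hz
    return step_deg, step_deg * pixels_per_deg

# An element moving at 20 deg/s, updated at 60 Hz and at 30 Hz.
deg_60, px_60 = step_per_frame(20.0, 60.0)
deg_30, px_30 = step_per_frame(20.0, 30.0)
```

At 60 Hz the element jumps about a third of a degree (roughly 13 pixels) per frame; halving the update rate doubles the jump. Faster motions and lower update rates therefore move the display further from smooth motion.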

When depicted objects move at high velocities, computer-generated displays 
possess a distinct artifact, most noticeable when comparing a computer-generated
"frame" with a frame from a movie film of the event. In the film frame, quickly 
moving objects will be blurred. In the computer-generated frame, however, all
objects are clearly defined. This leads to a strobing (or temporal aliasing) that 
appears quite unnatural. Effective motion-blur algorithms have been developed to 
solve this problem (e.g., Max and Lerner, 1985). In brief, these algorithms sample 
object location during each frame interval, and distribute object density accord- 
ingly. (Of course, employing such algorithms increases the computational complexity 
of the sequence and, thus far, cannot be performed in real-time.) 
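
The general idea of sampling object location within a frame interval can be sketched for a single point moving along one scan line. This is plain temporal supersampling written for illustration; it is not the algorithm of Max and Lerner (1985), and the function and parameter names are assumptions.

```python
import math

def render_frame(position_fn, t_start, frame_dt, width, n_samples=1):
    """Render one frame of a 1-D scene containing a single moving point.

    position_fn maps time (s) to position (pixels). With n_samples=1 the
    point is drawn at one crisp location (a strobed frame). With more
    samples its unit intensity is spread over the positions it occupies
    during the frame interval (a blurred frame).
    """
    frame = [0.0] * width
    for k in range(n_samples):
        # Sample at the midpoint of each sub-interval of the frame.
        t = t_start + frame_dt * (k + 0.5) / n_samples
        x = int(math.floor(position_fn(t)))
        if 0 <= x < width:
            frame[x] += 1.0 / n_samples
    return frame

# A point moving at 600 pixels/s, displayed at 60 frames/s.
move = lambda t: 600.0 * t
strobed = render_frame(move, 0.0, 1.0 / 60.0, 32, n_samples=1)
blurred = render_frame(move, 0.0, 1.0 / 60.0, 32, n_samples=10)
```

The strobed frame lights a single pixel, whereas the blurred frame distributes the same total intensity over the 10 pixels traversed during the frame. The cost is 10 position evaluations per frame instead of one, which illustrates the added computational load noted above.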


Object Realism 

The problems of object realism are similar to those of motion quality: there
are both computational power constraints and adequacy of algorithm limitations. The
two aspects of object realism discussed are surface properties and shading. We also 
limit our discussion to computer-generated objects, excluding those impressive 
computer displays that are simply digitized photographs. 

Real objects in the environment possess complex visual surface qualities. 
Texture and reflectance properties are difficult to model realistically for several 
reasons. First, reflectance properties have been studied for only a limited class 
of materials, with adequate mathematical description developed for fewer still. 
Second, few natural objects have smooth surfaces with constant reflective proper- 
ties; most surfaces in the environment are anisotropic (meaning that reflection is a 
function of orientation). For example, the threads in a weave of cloth will scatter 
light more narrowly in the direction of the thread than they will perpendicularly. 
Although some anisotropic models are being developed (e.g., Kajiya, 1985), such 
surfaces are still quite difficult to simulate. Thus, we find a plethora of smooth, 
regular objects in computer-graphics demonstrations. Finally, texture presents a 
challenge to efficient modeling. One wants to retain the stochastic nature of the 
texture while utilizing a consistent, efficient algorithm. Fractal geometry has 
been employed to this end (Mandelbrot, 1983), and has proved effective for a wide 
class of natural objects (e.g., mountains, trees, clouds). However, fractal models 
are not appropriate for all object classes, and most of the current fractal algo- 
rithms are computationally expensive. 

Realistic shading is difficult to achieve for similar reasons. In fact, the 
two issues are related since, in order to specify ambient light conditions for 
shading, reflectance properties must be known (Nishita and Nakamae, 1985). Con- 
sider, for example, the ray-tracing method of scene generation. In this method, a 
number of rays originate from each pixel and are allowed to propagate through the 
scene, bouncing from surface to surface in accordance with each object's reflectiv- 
ity (Cook, Porter, and Carpenter, 1984). Thus, ray-tracing algorithms must ade- 
quately model interreflection as well as primary lighting sources in order to 
achieve realistic-looking continuous tone representations. 

A final complexity is introduced when dynamic events rather than static scenes 
are generated. Since realistic surface and shadowing algorithms are computationally 
complex, it becomes extremely expensive to generate 20 to 60 frames for each second 
of the event. Ray-tracing techniques, for example, are beyond the capabilities of 
all but the most complex computer-graphics systems, and a single ray-traced frame
can require hours of CPU time to generate. Nonetheless, these complex algorithms 
are often employed since adequate interpolation algorithms have not yet been 
developed. 


The Adequacy of Simulated Dynamics 

The final issue we consider is the adequacy of simulated dynamics in computer- 
generated animation. Presently, only the most simple dynamic events (e.g., collid- 
ing balls, rotating objects) are generated from mathematical motion models. Even 
these often make simplifying assumptions, such as the absence of friction or the use 
of particle, as opposed to solid body, mechanics. 
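
A typical example of such a simplified model is the textbook treatment of two colliding balls as frictionless point masses moving along a line, with a perfectly elastic impact. The sketch below implements that standard result; it is offered as an illustration of the simplifying assumptions, not as the model used in any particular study.

```python
def elastic_collision_1d(m1, v1, m2, v2):
    """Velocities after a perfectly elastic, head-on collision of two
    point masses (no friction, no rotation, no deformation)."""
    total = m1 + m2
    v1_after = ((m1 - m2) * v1 + 2.0 * m2 * v2) / total
    v2_after = ((m2 - m1) * v2 + 2.0 * m1 * v1) / total
    return v1_after, v2_after

def momentum(m1, v1, m2, v2):
    """Total linear momentum of the two-body system."""
    return m1 * v1 + m2 * v2

def kinetic_energy(m1, v1, m2, v2):
    """Total kinetic energy of the two-body system."""
    return 0.5 * m1 * v1 * v1 + 0.5 * m2 * v2 * v2
```

With equal masses the two balls simply exchange velocities. Everything that distinguishes real balls from this idealization (spin, rolling friction, energy loss on impact) is absent from the model and therefore from the display.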

Consider, for example, biomechanical motions, such as those presented in point- 
light walker displays. Fully adequate mathematical models of biomechanical motion 
have yet to be developed, although progress is being made (Girard and Maciejewski, 
1985). At present, the most impressive examples of computer-animated biomechanical 
forms were created by techniques borrowed from the traditional animation arts. For 
example, rotoscoping (an animation technique developed at Disney Studio), has been 
employed to capture the dynamics of human motion. In rotoscoping, one first films a 
person performing the desired actions, then each film frame is used as a template to 
specify body and limb coordinate locations for each animation frame. Whereas such a 
technique produces impressive results for the cartoonist (e.g., Snow White) or the 
computer animator interested in special effects (e.g., Abel Graphic's metallic 
woman), it affords few advantages to the perceptual psychologist interested in the 
specification of, and observers' sensitivity to, biomechanical kinematics. 

An approach midway between rotoscope techniques and true mathematical motion 
models is the keyframe technique. Here, critical points of the event sequence are 
sampled, and intermediate coordinate positions are calculated based on assumed 
motion properties and constraints (Steketee and Badler, 1985). As yet, keyframe 
techniques cannot provide motion parameters that are sufficiently precise to be used 
in perceptual research. 
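
The keyframe idea can be sketched with the simplest possible assumption about the intermediate motion, namely straight-line, constant-velocity travel between keyframes. Linear interpolation is chosen here only for brevity and is an assumption of this sketch; published keyframe systems such as that of Steketee and Badler (1985) use smoother interpolants.

```python
def interpolate_keyframes(keyframes, t):
    """Position at time t given keyframes as a time-sorted list of
    (time, (x, y)) pairs, using linear interpolation between neighbors.
    Times outside the keyframe range are clamped to the end positions."""
    if t <= keyframes[0][0]:
        return keyframes[0][1]
    if t >= keyframes[-1][0]:
        return keyframes[-1][1]
    for (t0, p0), (t1, p1) in zip(keyframes, keyframes[1:]):
        if t0 <= t <= t1:
            w = (t - t0) / (t1 - t0)
            return tuple(a + w * (b - a) for a, b in zip(p0, p1))

# Three keyframes for a single point: right for 1 s, then up for 1 s.
keys = [(0.0, (0.0, 0.0)), (1.0, (10.0, 0.0)), (2.0, (10.0, 5.0))]
midway_first = interpolate_keyframes(keys, 0.5)
midway_second = interpolate_keyframes(keys, 1.5)
```

In this example the point changes velocity abruptly at the second keyframe, a kinematic feature introduced by the interpolation rule rather than by the event being depicted. Artifacts of this kind are one reason the in-between motion parameters cannot be taken as precise.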

As indicated above, the algorithms used by perceptual psychologists in their 
dynamic simulations have been very reductionist even for relatively simple physical 
events (e.g., two objects colliding). There are good reasons for employing extreme 
simplifying assumptions: precise motion modeling of complex physical systems is a
huge computational problem. The difficulty of developing adequate models of such
systems may be better understood by examining the problems confronted by other 
disciplines which have attempted similar modeling, such as computational fluid 
dynamics (CFD). 

Computational Fluid Dynamics attempts to numerically model fluid and gas 
flows. This work has been most strongly pursued in studies of aerodynamics, but 
also has applications in streamlining, weather prediction, wake dynamics, and wind- 
loading studies (Kutler, 1983). At NASA (the second author's institution) the 
ultimate goal of CFD is to allow aerospace engineers to optimize designs solely 
through computational models. At present, CFD modeling is complemented with
wind-tunnel and flight-test evaluations. Convergence of CFD models with these laboratory
and field results provides a basis for evaluating model adequacy. The advancement 
of CFD toward adequate models is seen as being paced by several constraints: the
development of appropriate turbulence models; the ability to model dynamics in three
dimensions instead of reducing the problem to a 2-D representation; and the develop- 
ment of more powerful computer architectures. 

Despite the computational complexity confronting CFD modeling (many 3-D models 
exceed the computational power of the Cray X-MP), there is the advantage that the 
criteria for model adequacy are well defined: a CFD model's performance should be
functionally equivalent to that found for: (1) a physical aircraft in the wind
tunnel and in the field, and (2) explicit analytic solutions where they exist. As
might be expected, however, there is a notable lack of consensus among experts as to 
when equivalence is reached. 

The transportation of the CFD criteria for simulation adequacy to perceptual 
research would require us to demonstrate that our simulations and the corresponding 
natural events are perceptually equivalent. (We shudder at the thought of percep- 
tual psychologists attempting to agree on criteria for when such perceptual equiva- 
lence is achieved.) Ideally, we should "flight test" our computer-generated stimuli 
to determine whether their dynamics are discriminable from those seen in natural 
events. In practice, we tend to assume that the differences are irrelevant to the 
questions under study, and that findings with computer-generated stimuli will gen- 
eralize to natural events. 


CONCLUSION 


Most current programs of motion-perception research could not be pursued with- 
out the use of computer-graphics animation. Computer-generated displays afford 
latitudes of freedom and control that are almost impossible to attain through
conventional methods. We think it important, however, to be aware that computer
simulations rarely, if ever, achieve a level of verisimilitude capable of causing an
observer to confuse the simulation with reality. All of the limitations discussed 
above place constraints on the apparent realism of computer-animated displays. 

When viewing a computer-generated display, a dual awareness is always
experienced: one has an awareness of both a 2-D contrived pattern and a projected 3-D
event. Sensitivity to the dynamics manifest in the latter is almost surely
influenced by the awareness of the former. We recommend caution in making unqualified
generalizations about human sensitivities to natural events from studies on perceiv- 
ing computer-animated displays. Convergent investigations employing natural objects 
are always desirable, although, in practice, such studies are often extremely diffi- 
cult to conduct. 